Value-Aware Loss Function for Model-based Reinforcement Learning
نویسندگان
چکیده
We consider the problem of estimating the transition probability kernel to be used by a model-based reinforcement learning (RL) algorithm. We argue that estimating a generative model that minimizes a probabilistic loss, such as the log-loss, is an overkill because it does not take into account the underlying structure of decision problem and the RL algorithm that intends to solve it. We introduce a loss function that takes the structure of the value function into account. We provide a finite-sample upper bound for the loss function showing the dependence of the error on model approximation error, number of samples, and the complexity of the model space. We also empirically compare the method with the maximum likelihood estimator on a simple problem. Proceedings of the 20 International Conference on Artificial Intelligence and Statistics (AISTATS) 2017, Fort Lauderdale, Florida, USA. JMLR: W&CP volume 54. Copyright 2017 by the author(s). Value-Aware Loss Function for Model-based Reinforcement Learning
منابع مشابه
Value-Aware Loss Function for Model Learning in Reinforcement Learning
We consider the problem of estimating the transition probability kernel to be used by a model-based reinforcement learning (RL) algorithm. We argue that estimating a generative model that minimizes a probabilistic loss, such as the log-loss, might be an overkill because such a probabilistic loss does not take into account the underlying structure of the decision problem and the RL algorithm tha...
متن کاملSafety-aware Adaptive Reinforcement Learning with Applications to Brushbot Navigation
This paper presents a safety-aware learning framework that employs an adaptive model learning method together with barrier certificates for systems with possibly nonstationary agent dynamics. To extract the dynamic structure of the model, we use a sparse optimization technique, and the resulting model will be used in combination with control barrier certificates which constrain feedback control...
متن کاملRRLUFF: Ranking function based on Reinforcement Learning using User Feedback and Web Document Features
Principal aim of a search engine is to provide the sorted results according to user’s requirements. To achieve this aim, it employs ranking methods to rank the web documents based on their significance and relevance to user query. The novelty of this paper is to provide user feedback-based ranking algorithm using reinforcement learning. The proposed algorithm is called RRLUFF, in which the rank...
متن کاملForecasting the Tehran Stock market by Machine Learning Methods using a New Loss Function
Stock market forecasting has attracted so many researchers and investors that many studies have been done in this field. These studies have led to the development of many predictive methods, the most widely used of which are machine learning-based methods. In machine learning-based methods, loss function has a key role in determining the model weights. In this study a new loss function is ...
متن کاملUncertainty-Aware Reinforcement Learning for Collision Avoidance
Reinforcement learning can enable complex, adaptive behavior to be learned automatically for autonomous robotic platforms. However, practical deployment of reinforcement learning methods must contend with the fact that the training process itself can be unsafe for the robot. In this paper, we consider the specific case of a mobile robot learning to navigate an a priori unknown environment while...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2017